Fuzzy Clustering of Parallel Data Streams
Abstract
The management and processing of so-called data streams have recently become a topic of active research in several fields of computer science, notably database systems and data mining. A data stream can roughly be thought of as a transient, continuously growing sequence of time-stamped data. In this paper, we consider the problem of clustering parallel streams of real-valued data, that is to say, continuously evolving time series. More specifically, we are interested in grouping data streams whose evolution over time is similar in a specific sense. To maintain an up-to-date clustering structure, the incoming data must be analyzed online, tolerating no more than a constant time delay. For this purpose, we develop an efficient online version of the fuzzy C-means clustering algorithm. A fuzzy approach appears to be particularly useful for this type of application, in which the clustering structure is subject to continuous change.
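For orientation, the following is a minimal sketch of the textbook batch fuzzy C-means iteration that the abstract builds on. It is not the online variant developed in the paper, and the function name `fuzzy_c_means` and its parameters (`m`, `iters`, `tol`, `seed`, `init`) are illustrative assumptions. It alternates two steps: memberships proportional to d^(-2/(m-1)), then centers as means weighted by the memberships raised to the power m.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0, init=None):
    """Textbook batch fuzzy C-means (hypothetical helper, not the paper's method).

    points: list of equal-length numeric tuples.
    Returns (centers, memberships); memberships[i][k] is the degree to which
    points[i] belongs to cluster k, and every row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Start from caller-supplied centers, or from c randomly chosen data points.
    if init is not None:
        centers = [list(v) for v in init]
    else:
        centers = [list(p) for p in rng.sample(points, c)]
    memberships = []
    for _ in range(iters):
        # Membership step: u_ik is proportional to d_ik^(-2/(m-1)).
        memberships = []
        for p in points:
            dists = [math.dist(p, v) for v in centers]
            if any(d == 0.0 for d in dists):
                # Point coincides with a center: assign crisp membership.
                crisp = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(crisp)
                memberships.append([z / total for z in crisp])
                continue
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            memberships.append([w / total for w in inv])
        # Center step: mean of all points weighted by u_ik^m.
        new_centers = []
        for k in range(c):
            weights = [u[k] ** m for u in memberships]
            wsum = sum(weights)
            new_centers.append([
                sum(w * p[j] for w, p in zip(weights, points)) / wsum
                for j in range(dim)
            ])
        # Stop once no center moved by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships
```

In a streaming setting, each point would be a feature vector summarizing the recent window of one stream, and the iteration would be re-run or warm-started as the windows slide; how the paper does this efficiently is not shown here.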
Similar resources
Online-Data-Mining auf Datenströmen: Methoden zur Clusteranalyse und Klassifikation
• J. Beringer and E. Hüllermeier. Efficient instance based learning on data streams. Adaptive optimization of the number of clusters in fuzzy clustering. Fuzzy clustering of parallel data streams.
High Performance Implementation of Fuzzy C-Means and Watershed Algorithms for MRI Segmentation
Image segmentation is one of the most common steps in digital image processing. Many image segmentation algorithms (e.g., thresholding, edge detection, and region growing) are employed for classifying a digital image into different segments. In this connection, finding a suitable algorithm for medical image segmentation is a challenging task, mainly due to the noise, low contrast, and steep...
Incorporation of Non-euclidean Distance Metrics into Fuzzy Clustering on Graphics Processing Units
Computational tractability of clustering algorithms becomes a problem as the number of data points, feature dimensionality, and number of clusters increase. Graphics Processing Units (GPUs) are low-cost, high-performance stream processing architectures currently used by the gaming, movie, and computer-aided design industries. Fuzzy clustering is a pattern recognition algorithm that has a great ...
Online fuzzy medoid based clustering algorithms
This paper describes two new online fuzzy clustering algorithms based on medoids. These algorithms have been developed to deal with either very large datasets that do not fit in main memory or data streams in which data are produced continuously. The innovative aspect of our approach is the combination of fuzzy methods, which are well adapted to outliers and overlapping clusters, with medoids a...